284 research outputs found

    Node-Oriented Workflow (NOW): A Command Template Workflow Management Tool for High Throughput Data Analysis Pipelines

    Next generation sequencing (NGS) systems produce vast quantities of data that require substantial computational resources for typical analysis tasks. In addition, data that are generated by different NGS systems are not homogeneous. Moreover, there are an overwhelming number of tools available for performing typical tasks. Managing NGS workflows involves writing custom scripts that quickly grow in complexity, often resulting in unwieldy workflows that underutilize typical high performance compute resources and increase the demands on the staff managing these workflows. We present Node-Oriented Workflow (NOW), a dynamic command template workflow engine for high performance distributed computing (HPC) systems. Our system provides a simple-to-use browser-based front end for designing and managing complex workflows. Workflows are configured using a simple browser interface, and are managed by the integrated job engine, which initializes nodes, monitors node status, and processes results of individual jobs across nodes in an HPC configuration. We reduce excessive messaging across nodes by placing the burden on nodes to start tasks in a workflow when dependencies are met, i.e., node-oriented workflow. Our system was designed for NGS processing in the clinical research setting, emphasizing user simplicity, tool scalability, and minimization of redundancy in workflows, while maximizing throughput in an HPC environment. Furthermore, NOW is not restricted to NGS pipeline management, but can be used to manage any computational pipeline.
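The idea of starting a task only once its dependencies are met can be illustrated with a small sketch. This is not NOW's implementation, which the abstract does not describe in detail; it is a single-process Python illustration, and the function name `run_workflow`, the task names, and the data layout are all hypothetical.

```python
from collections import defaultdict


def run_workflow(tasks, deps):
    """Run tasks so that each one starts only after its dependencies finish.

    tasks: dict mapping task name -> zero-argument callable (hypothetical commands)
    deps:  dict mapping task name -> set of task names it depends on
    Returns the list of task names in the order they were started.
    """
    # Unmet dependencies for every task.
    remaining = {name: set(deps.get(name, ())) for name in tasks}

    # Reverse index: which tasks are waiting on a given task.
    dependents = defaultdict(list)
    for name, needed in remaining.items():
        for dep in needed:
            dependents[dep].append(name)

    order = []
    # Tasks with no dependencies can start immediately.
    ready = sorted(name for name, needed in remaining.items() if not needed)
    while ready:
        name = ready.pop(0)
        tasks[name]()
        order.append(name)
        # A finished task releases its own dependents; no central poller
        # has to re-check every task after every completion.
        for child in sorted(dependents[name]):
            remaining[child].discard(name)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("cycle or unknown dependency in workflow")
    return order
```

With hypothetical tasks `align`, `sort`, `call_variants` and `qc`, where `sort` and `qc` depend on `align` and `call_variants` depends on `sort`, the function starts `align` first and releases the others as their dependencies complete.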

    High-resolution physical map for chromosome 16q12.1-q13, the Blau syndrome locus

    BACKGROUND: The Blau syndrome (MIM 186580), an autosomal dominant granulomatous disease, was previously mapped to chromosome 16p12-q21. However, inconsistent physical maps of the region, and consequently an unknown order of microsatellite markers, prevented us from further refining the genetic locus for the Blau syndrome. To address this problem, we constructed our own high-resolution physical map for the Blau susceptibility region. RESULTS: We generated a high-resolution physical map that provides more than 90% coverage of a refined Blau susceptibility region. The map consists of four contigs of sequence tagged site-based bacterial artificial chromosomes with a total of 124 bacterial artificial chromosomes, and spans approximately 7.5 Mbp; however, three gaps still exist in this map with sizes of 425, 530 and 375 kbp, respectively, estimated from radiation hybrid mapping. CONCLUSIONS: Our high-resolution map will assist genetic studies of loci in the interval from D16S3080, near D16S409, and D16S408 (16q12.1 to 16q13).

    A Sex-Stratified Genome-Wide Association Study of Tuberculosis Using a Multi-Ethnic Genotyping Array

    Tuberculosis (TB), caused by Mycobacterium tuberculosis, is a complex disease with a known human genetic component. Males seem to be more affected than females, and in most countries the TB notification rate is twice as high in males as in females. While socio-economic status, behavior and sex hormones influence the male bias, they do not fully account for it. Males have only one copy of the X chromosome, while diploid females are subject to X chromosome inactivation. In addition, the X chromosome codes for many immune-related genes, supporting the hypothesis that X-linked genes could contribute to TB susceptibility in a sex-biased manner. We report the first TB susceptibility genome-wide association study (GWAS) with a specific focus on sex-stratified autosomal analysis and the X chromosome. A total of 810 individuals (410 cases and 405 controls) from an admixed South African population were genotyped using the Illumina Multi Ethnic Genotyping Array, specifically designed as a suitable platform for diverse and admixed populations. Association testing was done on the autosome (827,386 variants) and X chromosome (20,939 variants) in a sex-stratified and combined manner. SNP association testing was not statistically significant using a stringent cut-off for significance but revealed likely candidate genes that warrant further investigation. A genome-wide interaction analysis detected 16 significant interactions. Finally, the results highlight the importance of sex-stratified analysis, as strong sex-specific effects were identified on both the autosome and X chromosome.

    Whole genome expression profiling reveals a significant role for immune function in human abdominal aortic aneurysms

    Background: Abdominal aortic aneurysms are a common disorder with an incompletely understood etiology. We used Illumina and Affymetrix microarray platforms to generate global gene expression profiles for both aneurysmal (AAA) and non-aneurysmal abdominal aorta, and identified genes that were significantly differentially expressed between cases and controls. Results: Affymetrix and Illumina arrays included 18,057 genes in common; 11,542 (64%) of these genes were considered to be expressed in either aneurysmal or normal abdominal aorta. There were 3,274 differentially expressed genes with a false discovery rate (FDR) ≤ 0.05. Many of these genes were not previously known to be involved in AAA, including SOST and RUNX3, which were confirmed using Q-RT-PCR (Pearson correlation coefficient for microarray and Q-RT-PCR data = 0.89; p-values for differences in expression between AAA and controls for SOST: 4.87 × 10⁻⁴ and for RUNX3: 4.33 × 10⁻⁵). Analysis of biological pathways, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG), indicated extreme overrepresentation of immune-related categories. The enriched categories included the GO category Immune Response (GO:0006955; FDR = 2.1 × 10⁻¹⁴), and the KEGG pathways natural killer cell mediated cytotoxicity (hsa04650; FDR = 5.9 × 10⁻⁶) and leukocyte transendothelial migration (hsa04670; FDR = 1.1 × 10⁻⁵). Conclusion: Previous studies have provided evidence for the involvement of the immune system in AAA. The current expression analysis extends these findings by demonstrating broad coordinate gene expression in immunological pathways. A large number of genes involved in immune function were differentially expressed in AAA, and the pathway analysis gave these results a biological context. The data provide valuable insight for future studies to dissect the pathogenesis of human AAA. These pathways might also be used as targets for the development of therapeutic agents for AAA.

    Analytical approaches to detect maternal/fetal genotype incompatibilities that increase risk of pre-eclampsia

    Background: In utero interactions between incompatible maternal and fetal genotypes are a potential mechanism for the onset or progression of pregnancy-related diseases such as pre-eclampsia (PE). However, the optimal analytical approach and study design for evaluating incompatible maternal/offspring genotype combinations is unclear. Methods: Using simulation, we estimated the type I error and power of incompatible maternal/offspring genotype models for two analytical approaches: logistic regression used with case-control mother/offspring pairs and the log-linear regression used with case-parent triads. We evaluated a real dataset consisting of maternal/offspring pairs with and without PE for incompatibility effects using the optimal analysis based on the results of the simulation study. Results: We identified a single coding scheme for the incompatibility effect that was equally or more powerful than all of the alternative analysis models evaluated, regardless of the true underlying model for the incompatibility effect. In addition, the log-linear regression was more powerful than the logistic regression when the heritability was low, and more robust to adjustment for maternal or fetal effects. For the PE data, this analysis revealed three genes, lymphotoxin alpha (LTA), von Willebrand factor (VWF), and alpha 2 chain of type IV collagen (COL4A2), with possible incompatibility effects. Conclusion: The incompatibility model should be evaluated for complications of pregnancy, such as PE, where the genotypes of two individuals may contribute to the presence of disease.

    Systematic Review of Genetic Factors in the Etiology of Esophageal Squamous Cell Carcinoma in African Populations

    Background: Esophageal squamous cell carcinoma (ESCC), one of the most aggressive cancers, is endemic in Sub-Saharan Africa, constituting a major health burden. It has the most divergence in cancer incidence globally, with high prevalence reported in East Asia, Southern Europe, and in East and Southern Africa. Its etiology is multifactorial, with lifestyle, environmental, and genetic risk factors. Very little is known about the role of genetic factors in ESCC development and progression among African populations. The study aimed to systematically assess the evidence on genetic variants associated with ESCC in African populations. Methods: We carried out a comprehensive search of all African published studies up to April 2019, using PubMed, Embase, Scopus, and African Index Medicus databases. Quality assessment and data extraction were carried out by two investigators. The strength of the associations was measured by odds ratios and 95% confidence intervals. Results: Twenty-three genetic studies on ESCC in African populations were included in the systematic review. They were carried out on Black and admixed South African populations, as well as on Malawian, Sudanese, and Kenyan populations. Most studies were candidate gene studies and included DNA sequence variants in 58 different genes. Only one study carried out whole-exome sequencing of 59 ESCC patients. Sample sizes varied from 18 to 880 cases and 88 to 939 controls. Altogether, over 100 variants in 37 genes were part of 17 case-control genetic association studies to identify susceptibility loci for ESCC. In these studies, 25 variants in 20 genes were reported to have a statistically significant association. In addition, eight studies investigated changes in cancer tissues and identified somatic alterations in 17 genes and evidence of loss of heterozygosity, copy number variation, and microsatellite instability. Two genes were assessed for both genetic association and somatic mutation. Conclusions: Comprehensive large-scale studies on the genetic basis of ESCC are still lacking in Africa. Sample sizes in existing studies are too small to draw definitive conclusions about ESCC etiology. Only a small number of African populations have been analyzed, and replication and validation studies are missing. The genetic etiology of ESCC in Africa is, therefore, still poorly defined.

    Immunohistochemical Analysis of the Natural Killer Cell Cytotoxicity Pathway in Human Abdominal Aortic Aneurysms

    Our previous analysis using genome-wide microarray expression data revealed extreme overrepresentation of immune-related genes belonging to the Natural Killer (NK) Cell Mediated Cytotoxicity pathway (hsa04650) in human abdominal aortic aneurysm (AAA). We followed up the microarray studies by immunohistochemical analyses using antibodies against nine members of the NK pathway (VAV1, VAV3, PLCG1, PLCG2, HCST, TYROBP, PTK2B, TNFA, and GZMB) and aortic tissue samples from AAA repair operations (n = 6) and control aortae (n = 8) from age-, sex- and ethnicity-matched donors from autopsies. The results confirmed the microarray results. Two different members of the NK pathway, HCST and GZMB, which act at different steps in the NK pathway, were actively transcribed and translated into proteins in the same cells in the AAA tissue, as demonstrated by double staining. Furthermore, double staining with antibodies against CD68 or CD8 together with HCST, TYROBP, PTK2B or PLCG2 revealed that CD68- and CD8-positive cells expressed proteins of the NK pathway but were not the only inflammatory cells involved in the NK pathway in the AAA tissue. The results provide strong evidence that the NK Cell Mediated Cytotoxicity Pathway is activated in human AAA and valuable insight for future studies to dissect the pathogenesis of human AAA.

    Evaluating the Accuracy of Imputation Methods in a Five-Way Admixed Population

    Genotype imputation is a powerful tool for increasing statistical power in an association analysis. Meta-analysis of multiple study datasets also requires a substantial overlap of SNPs for a successful association analysis, which can be achieved by imputation. Quality of imputed datasets is largely dependent on the software used, as well as the reference populations chosen. The accuracy of imputation of available reference populations has not been tested for the five-way admixed South African Colored (SAC) population. In this study, imputation results obtained using three freely accessible methods were evaluated for accuracy and quality. We show that the African Genome Resource is the best reference panel for imputation of missing genotypes in samples from the SAC population, implemented via the freely accessible Sanger Imputation Server.

    Design patterns for the development of electronic health record-driven phenotype extraction algorithms

    Background: Design patterns, in the context of software development and ontologies, provide generalized approaches and guidance to solving commonly occurring problems, or addressing common situations typically informed by intuition, heuristics and experience. While the biomedical literature contains broad coverage of specific phenotype algorithm implementations, no work to date has attempted to generalize common approaches into design patterns, which may then be distributed to the informatics community to efficiently develop more accurate phenotype algorithms. Methods: Using phenotyping algorithms stored in the Phenotype KnowledgeBase (PheKB), we conducted an independent iterative review to identify recurrent elements within the algorithm definitions. We extracted and generalized recurrent elements in these algorithms into candidate patterns. The authors then assessed the candidate patterns for validity by group consensus, and annotated them with attributes. Results: A total of 24 electronic Medical Records and Genomics (eMERGE) phenotypes available in PheKB as of 1/25/2013 were downloaded and reviewed. From these, a total of 21 phenotyping patterns were identified, which are available as an online data supplement. Conclusions: Repeatable patterns within phenotyping algorithms exist, and when codified and cataloged may help to educate both experienced and novice algorithm developers. The dissemination and application of these patterns has the potential to decrease the time to develop algorithms, while improving portability and accuracy.